Pitfalls in the use of Parallel Inference for the Dirichlet Process

Authors

  • Yarin Gal
  • Zoubin Ghahramani
Abstract

Recent work by Lovell, Adams, and Mansinghka (2012) and Williamson, Dubey, and Xing (2013) has suggested an alternative parametrisation of the Dirichlet process in order to derive non-approximate parallel MCMC inference for it. This work has been picked up and implemented in several different fields. In this paper we show that the suggested approach is impractical due to an extremely unbalanced distribution of the data. We characterise the requirements of efficient parallel inference for the Dirichlet process and show that the proposed inference fails most of these requirements (while approximate approaches often satisfy most of them). We present both theoretical and experimental evidence, analysing the load balance of the inference and showing that it is independent of the size of the dataset and of the number of nodes available in the parallel implementation. We end with suggestions of alternative paths of research for efficient non-approximate parallel inference for the Dirichlet process.
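The load-balance claim can be illustrated with a small simulation. The sketch below is not the authors' experiment. It assumes, which the abstract does not state, that data are divided among K nodes according to weights drawn from a symmetric Dirichlet distribution with concentration alpha / K per node. The function names, the value alpha = 1.0, and the node counts are hypothetical choices made for illustration only.

```python
import random


def sample_dirichlet(concentration, num_nodes, rng):
    """Draw symmetric Dirichlet weights by normalising Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(num_nodes)]
    total = sum(draws)
    if total == 0.0:
        # Guard against underflow for tiny concentrations:
        # put all of the mass on one randomly chosen node.
        weights = [0.0] * num_nodes
        weights[rng.randrange(num_nodes)] = 1.0
        return weights
    return [d / total for d in draws]


def mean_largest_share(alpha, num_nodes, trials=2000, seed=0):
    """Average share of the data expected on the most loaded node.

    Assumption (not from the abstract): node weights follow a symmetric
    Dirichlet with concentration alpha / num_nodes per node.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        weights = sample_dirichlet(alpha / num_nodes, num_nodes, rng)
        total += max(weights)
    return total / trials


if __name__ == "__main__":
    for num_nodes in (8, 64):
        share = mean_largest_share(alpha=1.0, num_nodes=num_nodes)
        print(f"{num_nodes} nodes: busiest node holds {share:.2f} of the data; "
              f"a balanced split would give {1 / num_nodes:.3f}")
```

Under these assumptions, comparing the busiest node's share with the balanced share 1 / K for several values of K gives a rough feel for the kind of imbalance the abstract describes. It says nothing about the paper's actual experimental setup.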


Similar resources

Analytical D’Alembert Series Solution for Multi-Layered One-Dimensional Elastic Wave Propagation with the Use of General Dirichlet Series

A general initial-boundary value problem of one-dimensional transient wave propagation in a multi-layered elastic medium due to arbitrary boundary or interface excitations (either prescribed tractions or displacements) is considered. Laplace transformation technique is utilised and the Laplace transform inversion is facilitated via an unconventional method, where the expansion of complex-valued...


Perception of fear and adoption of risk control for hookah use among male students: using the extended parallel process model

Hookah use has become popular among adults, especially young students, and this is an important issue for the future of society. More knowledge of the subject is needed. This study examined hookah use among students in Mashhad using the extended parallel process model. It was a cross-sectional study of male undergraduate students in Mashhad in 91-92 ac...


Introducing of Dirichlet process prior in the Nonparametric Bayesian models frame work

Statistical models are used to learn about the mechanism that generates the data. It is often assumed that the random variables y_i, i = 1, …, n, are samples from a probability distribution F that belongs to a parametric class of distributions. In practice, however, a parametric model may be inappropriate for describing the data. In this setting, the parametric assumption could be r...


Hybrid Parallel Inference for Hierarchical Dirichlet Processes

The hierarchical Dirichlet process (HDP) can provide a nonparametric prior for a mixture model with grouped data, where mixture components are shared across groups. However, the computational cost is generally very high in terms of both time and space complexity. Therefore, developing a method for fast inference of HDP remains a challenge. In this paper, we assume a symmetric multiprocessing (S...


Parallel Markov Chain Monte Carlo for Dirichlet Process Mixtures

The Dirichlet process (DP) is a fundamental mathematical tool for Bayesian nonparametric modeling, and is widely used in tasks such as density estimation, natural language processing, and time series modeling. In most applications, however, the Dirichlet process requires approximate inference to be performed with variational methods or Markov chain Monte Carlo (MCMC). MCMC provides a “gold stan...



Publication date: 2014